Railroad accident analysis using extreme gradient boosting

نویسندگان

چکیده

• Derailments account for 70 % of the industry’s average annual accident cost. Machine learning identified factors that more strongly associate with derailments. Extreme gradient boosting outperformed 10 other types machine methods. Railroads are critical to economic health a nation. Unfortunately, railroads lose hundreds millions dollars from accidents each year. Trends reveal derailments consistently than U.S. railroad Hence, knowledge explanatory distinguish can inform cost-effective and impactful risk management strategies. Five feature scoring methods, including ANOVA Gini, agreed top four in type prediction were track class, movement authority, excess speed, territory signalization. Among 11 different algorithms, extreme method was most effective at predicting an area under receiver operating curve (AUC) metric 89 %. Principle component analysis revealed relative types, associated lower classes, non-signalized territories, authorizations within restricted limits. On average, occurred 16 kph below speed limit class whereas 32 limit. use integrated data preparation, learning, ranking framework presented gain additional insights managing risk, based on their unique environments.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bioactive Molecule Prediction Using Extreme Gradient Boosting.

Following the explosive growth in chemical and biological data, the shift from traditional methods of drug discovery to computer-aided means has made data mining and machine learning methods integral parts of today's drug discovery process. In this paper, extreme gradient boosting (Xgboost), which is an ensemble of Classification and Regression Tree (CART) and a variant of the Gradient Boosting...

متن کامل

Predicting Customer Churn: Extreme Gradient Boosting with Temporal Data

Accurately predicting customer churn using large scale time-series data is a common problem facing many business domains. The creation of model features across various time windows for training and testing can be particularly challenging due to temporal issues common to time-series data. In this paper, we will explore the application of extreme gradient boosting (XGBoost) on a customer dataset ...

متن کامل

General Functional Matrix Factorization Using Gradient Boosting

Matrix factorization is among the most successful techniques for collaborative filtering. One challenge of collaborative filtering is how to utilize available auxiliary information to improve prediction accuracy. In this paper, we study the problem of utilizing auxiliary information as features of factorization and propose formalizing the problem as general functional matrix factorization, whos...

متن کامل

Cost efficient gradient boosting

Many applications require learning classifiers or regressors that are both accurate and cheap to evaluate. Prediction cost can be drastically reduced if the learned predictor is constructed such that on the majority of the inputs, it uses cheap features and fast evaluations. The main challenge is to do so with little loss in accuracy. In this work we propose a budget-aware strategy based on dee...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Accident Analysis & Prevention

سال: 2021

ISSN: ['1879-2057', '0001-4575']

DOI: https://doi.org/10.1016/j.aap.2021.106126